# New Jersey

* Data were obtained from the New Jersey Department of State at https://web.archive.org/web/20220705162030/https://nj.gov/state/elections/election-information-2020.shtml#general 

* The original `pdf` files are in the `original` folder

* Transcription was difficult enough in this case to require a largely manual process. The research team attempted to convert the original `pdf` files into `csv` files using Able2Extract, but could not obtain usable structured data this way. For example, the `pdf` files for Mercer County's 2 audited districts run to over 500 pages in largely tally-based formats that intersperse audited totals with official vote totals.

* A mixture of three approaches was therefore used. For Mercer County, workers on Amazon MTurk were hired to enter the relevant data. Their work, which can be seen in its original form in the `transcribed` folder, was cross-checked against samples that the research team coded independently. For Cape May County, Able2Extract produced an intermediate document (also in the `transcribed` folder), which was then manually reshaped into the desired format. For all other counties that produced eligible data, the team typed the data manually from the original `pdf` files into the desired format.

* The file `nj_cleaned.csv` in the `ready` folder combines results from all counties. The file `mercer_cleaned.csv` contains the cleaned and reshaped Mercer County data that the MTurk workers transcribed from the original `pdf` files.
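
The cross-check of the MTurk transcriptions against independently coded samples could be sketched roughly as below. This is an illustration only, not the script the research team used. The column names (`district`, `candidate`, `votes`), the helper functions `load_rows` and `find_mismatches`, and the inline sample data are all hypothetical; the actual files in the `transcribed` folder may be laid out differently.

```python
import csv
import io


def load_rows(handle, key_fields):
    """Index CSV rows by a tuple of the given key columns."""
    reader = csv.DictReader(handle)
    return {tuple(row[k].strip() for k in key_fields): row for row in reader}


def find_mismatches(transcribed, sample, value_fields):
    """List (key, field, transcribed_value, sample_value) for each disagreement.

    Only keys present in the independently coded sample are checked. A key
    that is missing from the transcription is reported with None.
    """
    mismatches = []
    for key, sample_row in sample.items():
        transcribed_row = transcribed.get(key)
        for field in value_fields:
            got = transcribed_row[field].strip() if transcribed_row else None
            expected = sample_row[field].strip()
            if got != expected:
                mismatches.append((key, field, got, expected))
    return mismatches


# Hypothetical stand-ins for a transcribed file and an independent sample.
TRANSCRIBED = """district,candidate,votes
D1,A,120
D1,B,95
D2,A,88
"""
SAMPLE = """district,candidate,votes
D1,B,95
D2,A,83
"""

keys = ("district", "candidate")  # assumed identifying columns
transcribed = load_rows(io.StringIO(TRANSCRIBED), keys)
sample = load_rows(io.StringIO(SAMPLE), keys)
mismatches = find_mismatches(transcribed, sample, ["votes"])
print(mismatches)
```

With real files, the `io.StringIO` objects would be replaced by handles from `open(path, newline="")`. Any rows reported as mismatches would then need to be resolved by hand against the original `pdf`.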
